BEST: Exploratory analysis

The BEST project aims to understand what are the behavioural responses of people who rely on natural resources for their livelihood, and whose resources could potentially present threshold like dynamics, aka. “regime shifts”. For this study we designed a game played with fishermen in four communities of the Colombian caribbean. Here I present a short exploration of the game data and survey.

Where?

The game was performed in four coastal communities of the Colombian Caribbean. Taganga is traditional fishing town that in the last decades has become more of a turistic hub for backpackers. Fishing is still important in the area and is mainly done in the sea, sometimes on the marine territory of Tayrona National Park. Tasajeras is a small fishing community located on the road that communicates Santa Marta and Barranquilla, two major cities. Their location favors fishing both in the sea and inside the Cienaga Grande de Santa Marta, a wetland complex of brakish water, mangroves and a very special ecosystem that depends on the exchange of salt water from the sea and fresh water from the Magdalena river. The inhabitants of Buenavista literally live on the Cienaga. Their houses stand just above water on an area of the Cienaga that is flooded all year round. They are the poorest community we visited and they depend completely on explotation of natural resources for survival. Las Flores, in contrast, is actually a neighbourhood of Barranquilla located in the outskirts of the city. They fish on the Magdalena river’s mouth with very special techniques that use wind and kites to fish on the sea. They also exploit resources in neigbouring wetlands formed by the delta.

We designed the experiment to test for differences on treatment, but also to test for difference on ecological conditions, expecting that communities who rely more strongly on the Cienaga (a closed system that has undergone regime shifts in the past) would behave differently that communities exposed more to oceanic conditions; where variability has perhaps masked/buffered regime shifts.

load('~/Documents/Projects/BEST - Beijer/BEST/mapBEST.RData')
ggmap(map) + 
  geom_point(aes(x=lon, y=lat, colour='Orange', size=2), data=coords, alpha=0.5, 
             show.legend = F) +
  geom_text(aes(x=lon, y=lat), data= coords ,
            label=c('Taganga', 'Tasajeras', 'Buenavista', 'Las Flores'), 
            size=4, colour='blue', nudge_y = 0.03) + ggtitle('Study area')

A graphical exploration of the survey

We ran a survey with each of the 256 participants of our fishing game. The 56 questions of the survey are grouped around five topics:

  1. the game
  2. fishing activities
  3. changes in resources and traditional ecological knowledge
  4. cooperation and community activities
  5. demographic information at household level

For each section I present a series of graphic summaries for questions whose answer was binary or numeric. Later I elaborate on more detailed information for text data. The version of this graphical summary is a lazy version. I’ve written a function that works on any question of the survey but does not modified the x-axis labels for any figure. This allow us to identify preliminary problems with data, e.g. wether it has been errors on digitalizing the raw data. However, unless is explicitely said on the figure label, 0 means no and 1 means yes.

1. The game

Note that Q3 has an error, it should be binary 1/0 and has values of 2. Most people was not expecting the end of the game and wanted to play at least 5 more rounds. That’s good as control for end of game effect.

q1 <- question (surv, q1=372, q2=25, q3=2, fun=mean) + ggtitle('Q2. Life satisfaction\n 1= very satisfied : 4 = very unsatisfied')
q2 <- question (surv, q1=372, q2=26, q3 = 2, fun=mean) + ggtitle('Q3. Have you participated in economic\n experiments before?')
q3 <- question (surv, q1=372, q2=27, q3 = 2, fun=mean)  + ggtitle('Q4. Did you played with any of your\n fishing partners?')
q4 <- question (surv, q1=372, q2=28, q3 = 2, fun=mean)  + ggtitle('Q5. Were you surprised at the end?')
q5 <- question (surv, q1=372, q2=29, q3 = 2, fun=mean)  + ggtitle('Q6. How many extra rounds were you expecting?\n 0 = none; 1 = <5; 2 = >5')
grid.arrange(q1,q2,q3,q4,q5, nrow=1, ncol=5)

2. Fishing activities

Most people started fishing as teenager, however it varies depending of place. For future analysis a density plot might better reveal such differences. We have errors on data input for questions 8 & 10. Most people have been fishermen all then time since they started, and most of them have also been fishing on the same community. For less than half of our sample, people stop fishing at least a month last year (2015)

grid.arrange(
  question (surv, q1=372, q2=31, q3 = 2, fun=mean)  + ggtitle('7. How old did you start fishing?'),
  question (surv, q1=372, q2=32, q3 = 2, fun=mean)  + ggtitle('8. Do you fish most of the time?'),
  question (surv, q1=372, q2=33, q3 = 2, fun=mean)  + ggtitle('9. Have you been fishing here\n since you started?'),
  question (surv, q1=372, q2=35, q3 = 2, fun=mean)  + ggtitle('10. Last year, there were months\n when you have not fished?'),
  nrow=1, ncol=4
)
## Warning: position_stack requires non-overlapping x intervals

## Warning: position_stack requires non-overlapping x intervals

I’d like to get a cleaner version of the fishing calendar (e.g. a heatmap), but for now the figure below summarizes when were people non-fishing last year. Note is a crappy graph, not all plots are on the same scale… but aesthetics apart, one can see that not surprisingly the month when people stop fishing more often is December, probably also due to festivities.

grid.arrange(
  question (surv, q1=372, q2=36, q3 = 2, fun=mean)  + ggtitle('January'),
  question (surv, q1=372, q2=37, q3 = 2, fun=mean)  + ggtitle('Februrary'),
  question (surv, q1=372, q2=38, q3 = 2, fun=mean)  + ggtitle('March'),
  question (surv, q1=372, q2=39, q3 = 2, fun=mean)  + ggtitle('April'),
  question (surv, q1=372, q2=40, q3 = 2, fun=mean)  + ggtitle('May'),
  question (surv, q1=372, q2=41, q3 = 2, fun=mean)  + ggtitle('June'),
  question (surv, q1=372, q2=42, q3 = 2, fun=mean)  + ggtitle('July'),
  question (surv, q1=372, q2=43, q3 = 2, fun=mean)  + ggtitle('August'),
  question (surv, q1=372, q2=44, q3 = 2, fun=mean)  + ggtitle('September'),
  question (surv, q1=372, q2=45, q3 = 2, fun=mean)  + ggtitle('October'),
  question (surv, q1=372, q2=46, q3 = 2, fun=mean)  + ggtitle('November'),
  question (surv, q1=372, q2=47, q3 = 2, fun=mean)  + ggtitle('December'),
  nrow=2, ncol=6
)

The questions about fishing effort are not well coded. In question 16 a & b both . and , were used to designate decimals which results on errors and not plot shown. We also have an outlier regarding earning in Colombian pesos, who said that on a bad day will make 3 * 106 which is highly unrealistic. Errors aside, most people fish 6 days a week and the most common fishing day is of 12 hours, followed by working days of 10 and 8 hours. Although the graphs for earnings are less than suitable here (re-scale and use density instead of geom_bar) the minimum daily wage on an average day is Col$ 500 and the median is Col$ 310^{4} which is less than US$10. On a bad day most people don’t earn anything and the median is 2000 which is less than 10Sek. Most people have bad days once or more than once a week.

grid.arrange(
  # repeat graph with position = 'dodge' in geom_bar
  question (surv, q1=372, q2=49, q3 = 2, fun=mean)  + ggtitle('14. Days fishing in a normal week'), 
  question (surv, q1=372, q2=50, q3 = 2, fun=mean)  + ggtitle('15. Fishing hours in a normal day'),
  #question (surv, q1=372, q2=51, q3 = 2, fun=mean)  + ggtitle('16a. Kg of fish in a normal day'), ## error!!! summary(surv[,51]) reveals there is . and , used for decimals
  #question (surv, q1=372, q2=52, q3 = 2, fun=mean)  + ggtitle('16b. Earnings in Col$ in a normal day'), ## error summary(surv[,52]) # one of the datapoins is 3 million pesos per day!
  question (surv, q1=372, q2=53, q3 = 2, fun=mean)  + ggtitle('16c. Kg of fish in a good day'), # error
  question (surv, q1=372, q2=54, q3 = 2, fun=mean) 
    + geom_histogram(aes(alpha=0.2)) + ggtitle('16d. Earnings Col$ in a good day'), # error
  question (surv, q1=372, q2=55, q3 = 2, fun=mean)  
    + ggtitle('16e. Kg of fish in a bad day'), # error
  question (surv, q1=372, q2=56, q3 = 2, fun=mean) + geom_histogram(aes(alpha=0.2)) + ggtitle('16f. Earnings Col$ in a bad day'), # error in distribution table(surv[,56])/16
  question (surv, q1=372, q2=57, q3 = 2, fun=mean)  
    + ggtitle('16f. How often do you have a bad day?\n 1 = 1/year, 2 = 1/month, 3 = 1/week, 4 = n/week'),
  nrow=2, ncol=4
)

Most people fish in company of other fishermen, but almost a fifth of our surveyed fishermen fish alone. Crew size varies across places, very few people fish with crews > 12, and they are from Taganga or Las Flores, the fishing towns where open sea fishing with more industrial styles are possible. In fact, the maximum crew size for Tasajera is 5 and in Buenavista is 3 or 4. Note we have an error on the raw data for Buenavista datapoints. Most people fish with the same crew, but most of the time the decision on when to go fishing is made by other person than the intervieweed. The most typical boat is bote which implies a small size vessel without engine or with a small power engine. Only few people (<20) fish on lanchas which are bigger boats with more powerful engines, most of them are located in Las Flores where wind and currents are strong due to the Magdalena river delta. Similarly, around 60% of fishermen are not the captain neither own the boat they fish with. They don’t fish on the same places either. Half of the people own their fishing art being ownership most common in Buenavista and less common in Taganga.

grid.arrange(
  question (surv, q1=372, q2=58, q3 = 2, fun=mean)  + ggtitle('18. Do you fish with someone\n else?'),
  question (surv, q1=372, q2=59, q3 = 2, fun=mean)  + ggtitle('18.1 How often?\n 1=rare, 2=1/2times, 3=most, 4=always'),
  question (surv, q1=372, q2=60, q3 = 2, fun=mean)  + ggtitle('18.2 How many are you?'), # repeat this plot with density table(surv[,60])/16
  question (surv, q1=372, q2=61, q3 = 2, fun=mean)  + ggtitle('18.3 Same crew?'),
  question (surv, q1=372, q2=62, q3 = 2, fun=mean)  + ggtitle('18.4 Who decides when to fish?\n 0=me 1=some else'),
  question (surv, q1=372, q2=64, q3 = 2, fun=mean)  + ggtitle('19. Do you fish by boat\n or lancha?'),
  question (surv, q1=372, q2=65, q3 = 2, fun=mean)  + ggtitle('19.1 Are you the captain?'),
  question (surv, q1=372, q2=66, q3 = 2, fun=mean)  + ggtitle('19.2 Do you own the boat?'),
  question (surv, q1=372, q2=67, q3 = 2, fun=mean)  + ggtitle('20. Do you fish in the same place?'),
  question (surv, q1=372, q2=89, q3 = 2, fun=mean)  + ggtitle('23. Do you own the fishing art?'),
  nrow=2, ncol=5
)

Where does the food goes? Most fishermen take some fish home but they usually sell more than half and ocassionally give some away. When asked if they imagine themeselves being fishermen in 10 years most people think they will be. Question 28 is horribly coded, it should be between 0:4 but has values >5 up to 10! There is also typing errors in Tasajera where many values were included in answers that should have been unique.

grid.arrange(
  question (surv, q1=372, q2=90, q3 = 2, fun=mean)  + ggtitle('24. How much do you take homes?\n 0=none, 1=some, 2=half, 3=>half, 4=all'),
  question (surv, q1=372, q2=91, q3 = 2, fun=mean)  + ggtitle('25. How much do you sell?\n 0=none, 1=some, 2=half, 3=>half, 4=all'),
  question (surv, q1=372, q2=93, q3 = 2, fun=mean)  + ggtitle('26. How much do you give away?\n 0=none, 1=some, 2=half, 3=>half, 4=all'),
  question (surv, q1=372, q2=93, q3 = 2, fun=mean)  + ggtitle('27. Do you think you will be fishermen in 10yrs?\n 0=NO, 1=no, 2=yes, 3=YES, 4=dont know'),
  question (surv, q1=372, q2=96, q3 = 2, fun=mean)  + ggtitle('28. Do you think your children will fish?\n 0=NO, 1=no, 2=yes, 3=YES, 4=dont know'),
  nrow=1, ncol=5
)

3. Changes in resources & TEK

When we asked about species dynamics people reported that there has been speces that they don’t fish as much as before. More than 150 respondants report this changes as dramatic. However, this part of the survey is horribly coded. There are errors in data for Tasajera for questions about when the species dissapeared or if is still missing, values should have been numeric but it was not coded that way. Errors lead to no plots produced.

grid.arrange(
  question (surv, q1=372, q2=98, q3 = 2, fun=mean)  + ggtitle('31. Have you been fishing the same spp?'), # should be binary
  question (surv, q1=372, q2=100, q3 = 2, fun=mean)  + ggtitle('33. Is there any species that you dont fish as much as before?'), # error
  question (surv, q1=372, q2=102, q3 = 2, fun=mean)  + ggtitle('35. Dramatic changes?'), # error
  question (surv, q1=372, q2=104, q3 = 2, fun=mean)  + ggtitle('35.3. Sp1 When?'), ### Errors, horribly coded! summary(surv[,104])
  question (surv, q1=372, q2=105, q3 = 2, fun=mean)  + ggtitle('35.4. Sp1 How long, still missing?'),
  question (surv, q1=372, q2=109, q3 = 2, fun=mean)  + ggtitle('35.3. Sp2 When?'),
  question (surv, q1=372, q2=110, q3 = 2, fun=mean)  + ggtitle('35.4. Sp2 How long, still missing?'),
  question (surv, q1=372, q2=114, q3 = 2, fun=mean)  + ggtitle('35.3. Sp3 When?'),
  question (surv, q1=372, q2=115, q3 = 2, fun=mean)  + ggtitle('35.4. Sp3 How long, still missing?'),
  question (surv, q1=372, q2=119, q3 = 2, fun=mean)  + ggtitle('35.3. Sp4 When?'),
  question (surv, q1=372, q2=120, q3 = 2, fun=mean)  + ggtitle('35.4. Sp4 How long, still missing?'),
  nrow=2, ncol=6
)

Responses on abrupt events on fishing: In question 35 people were asked if they have experienced an abrupt change (above graph). Questions 36:40 were only asked to people who did experience abrupt changes, so N != 256. People were asked if the abrupt event described on table/question 35 have affected what they fish. Most people answer positively, nevertheless about 40 people say ‘no’ (Q36 error with 10 in Buenavista where it should have been 1/0). Since the occurrance of the event, most people spend more time fishing, less than 20 spends the same or less time respectively. Just above 60 people changed their fishing area while just below 50 didn’t (Q36.3 error non-binary value in Las Flores). Most people expected abrupt changes in the future. Question 39 was a multiple choice answer type, but only to ~50 respondents were read the options out loud. In case of an abrupt event fewer people would continue fishing, most of them won’t increase effort, but they won’t reduce their fishing time neither, less than 40 would consider changing fishing grounds or fishing arts, and most of them wont stop fishing. A lot of questions in this section have coding errors for data in Taganga with contradictory values for the same individual, or with values of 2 when it should be 1/0.

grid.arrange(
  question (surv, q1=372, q2=123, q3 = 2, fun=mean)  + ggtitle('36. Did the dramatic event\n change your catch?'), # error with a 10 in Buenavista
  question (surv, q1=372, q2=125, q3 = 2, fun=mean)  + ggtitle('36.2 Do you spend more time fishing\n 0=less, 1=more, 2=same?'),
  question (surv, q1=372, q2=126, q3 = 2, fun=mean)  + ggtitle('36.3 Did you changed your\n fishing area?'),
  question (surv, q1=372, q2=127, q3 = 2, fun=mean)  + ggtitle('37. Do you expect other changes?'),
  question (surv, q1=372, q2=129, q3 = 2, fun=mean)  + ggtitle('39.1 What would you do if there\n is abrupt reduction of fish\n Where the question options read out loud?'),
  question (surv, q1=372, q2=130, q3 = 2, fun=mean)  + ggtitle('39.2 Continue fishing'),
  question (surv, q1=372, q2=132, q3 = 2, fun=mean)  + ggtitle('39.3 Increase effort'), # error in taganga
  question (surv, q1=372, q2=134, q3 = 2, fun=mean)  + ggtitle('39.4 Less fishing'),# error in taganga
  question (surv, q1=372, q2=136, q3 = 2, fun=mean)  + ggtitle('39.5 Change fishing area'),# error in taganga
  question (surv, q1=372, q2=138, q3 = 2, fun=mean)  + ggtitle('39.6 Change fishing art'), # error taganga
  question (surv, q1=372, q2=140, q3 = 2, fun=mean)  + ggtitle('39.7 No more fishing'), # not binary
  nrow=2, ncol=6
)

4. Cooperation and community

Most fishermen shared their fishing arts, only ~100 belong to a fishing coop. The community with stronger coop presence is Las Flores (remember the local contact was the coop leader!) while the community with less cooperative participation is Buenavista. People participating in coops usually meet once a month followed by twice a year. Surprisingly, most participants self-report to be treasurers (but there is only one treasurer per coop!?). The graphs about the reasons to belong to a coop doesn’t make much sense. Future versions should compare (bars) for positive answers, here I’m not sure if negative was coded as default or that they did not perceive the benefit. For example, for ‘better prices’ only 20 people report that benefit while ~80 said ‘no’ to that question. Here 0 means that they don’t get the benefit, or that the option was not mentioned? In any case, the most popular positive answers were about recognition as organizaton member and social aspects. Errors: In Q40 is not binary, error in Taganga data; Q42.3 has errors due to coding, for questions where time was asked interviewers were not consistent on the way they wrote down answers. Some times they used years, months, numbers, text and for now we cannot extract the info automatically.

grid.arrange(
  question (surv, q1=372, q2=143, q3 = 2, fun=mean)  + ggtitle('40. Share fishing art'), # error not binary
  question (surv, q1=372, q2=145, q3 = 2, fun=mean)  + ggtitle('42. Belongs to coop?'),
  question (surv, q1=372, q2=147, q3 = 2, fun=mean)  + ggtitle('42.2 How often do you meet?\n 1=1/yr, 2=2/yr, 3=1/month, 4=1/week'), # not binary
  question (surv, q1=372, q2=148, q3 = 2, fun=mean)  + ggtitle('42.3 Since when?'), # error due to coding summary(surv[,148])
  question (surv, q1=372, q2=149, q3 = 2, fun=mean)  + ggtitle('42.4 What is your role?\n 1=representant, 2=president, 3=secretary, \n 4=treasurer, 5=member, 6=other'),
  question (surv, q1=372, q2=152, q3 = 2, fun=mean)  + ggtitle('42.5.1. options read out loud?'),
  question (surv, q1=372, q2=153, q3 = 2, fun=mean)  + ggtitle('42. Better prices'),
  question (surv, q1=372, q2=154, q3 = 2, fun=mean)  + ggtitle('42. Conservation of fish resource'),
  question (surv, q1=372, q2=155, q3 = 2, fun=mean)  + ggtitle('42. Support if low income'),
  question (surv, q1=372, q2=156, q3 = 2, fun=mean)  + ggtitle('42. Recognition as org member'),
  question (surv, q1=372, q2=157, q3 = 2, fun=mean)  + ggtitle('42. was pushed'),
  question (surv, q1=372, q2=158, q3 = 2, fun=mean)  + ggtitle('42. social aspects'),
  question (surv, q1=372, q2=159, q3 = 2, fun=mean)  + ggtitle('42. increase fishing & income'),
  question (surv, q1=372, q2=160, q3 = 2, fun=mean)  + ggtitle('42. Better quality of fishing products'),
  nrow=3, ncol=5
)

5. Demographics

Almost all our game participants were male fishermen, most of them married or in co-habitation with partner. The average age for the full sample is 42.3913043 and the median is 42. Our sample is younger in Taganga and older in Las Flores. Most people finished elementary school, over 25 respondents didn’t have any formal education, and less than 20 had university degrees almost all of them from Taganga (There is an error with 0 - datapoint in Tasajeras). Most respondants were born on the community but ~70 of them came from somewhere else (error data point with 10). About 40 people have been moving around while <200 have always live in the same community.


grid.arrange(
  question (surv, q1=372, q2=161, q3 = 2, fun=mean)  + ggtitle('43. Gender \n Female=0'),
  question (surv, q1=372, q2=162, q3 = 2, fun=mean)  + ggtitle('44. Status\n Single=1, married=2, sambo=3, div/widow=4'),
  question (surv, q1=372, q2=163, q3 = 2, fun=mean)  + ggtitle('45. Age'),
  question (surv, q1=372, q2=164, q3 = 2, fun=mean)  + ggtitle('46. Formal education\n none=1, elementary=2, high=3, uni=4'),
  question (surv, q1=372, q2=165, q3 = 2, fun=mean)  + ggtitle('46.1 Years of education'), # error on coding
  question (surv, q1=372, q2=166, q3 = 2, fun=mean)  + ggtitle('47. Born here?'), # error non binary
  question (surv, q1=372, q2=168, q3 = 2, fun=mean)  + ggtitle('49. Since when do you live here?'), # error coding
  question (surv, q1=372, q2=169, q3 = 2, fun=mean)  + ggtitle('50. Have you always lived here?'),
  question (surv, q1=372, q2=170, q3 = 2, fun=mean)  + ggtitle('51. How long have you\n been living here?'), #error coding
  question (surv, q1=372, q2=171, q3 = 2, fun=mean)  + ggtitle('51. Displaced?'), # error coding summary(surv[,171])

  nrow=2, ncol=6
)

When asking if they have suffered forced displacement the answers were coding on diverse formats and we cannot plot the info at the moment. However, few respondents (n=5) report forced displacement from 6 months up to 7 years. Question 52 was designed to assess sense of place. Each participant was asked to agree or disagree on a scale from 1 to 4 (1=strongly agree, 2=agree, 3=disagree, 4 strongly disagree) with statements that were read outloud for them. The figure below summarize their answers per location but I’m not quite sure how to interpret them.

grid.arrange(
  
  question (surv, q1=372, q2=172, q3 = 2, fun=mean)  + ggtitle('52.1 Miss the place'),
  question (surv, q1=372, q2=173, q3 = 2, fun=mean)  + ggtitle('52.2 Not belonging'),
  question (surv, q1=372, q2=174, q3 = 2, fun=mean)  + ggtitle('52.3 Safety'),
  question (surv, q1=372, q2=175, q3 = 2, fun=mean)  + ggtitle('52.4 Proud'),
  question (surv, q1=372, q2=176, q3 = 2, fun=mean)  + ggtitle('52.5 Part of me'),
  question (surv, q1=372, q2=177, q3 = 2, fun=mean)  + ggtitle('52.6 go away'),
  question (surv, q1=372, q2=178, q3 = 2, fun=mean)  + ggtitle('52.7 engaged'),
  question (surv, q1=372, q2=179, q3 = 2, fun=mean)  + ggtitle('52.8 my roots'),
  question (surv, q1=372, q2=180, q3 = 2, fun=mean)  + ggtitle('52.9 family and friends'),
  nrow=3, ncol=3
)

The game

Now I turn your attention to the game data. On groups of 4 we played with the 256 fishermen a fishing game, where each person were supposed to make a decision on how much to fish. Each group of 4 fishermen shared a common fish stock that started with 50 fish. Their decisions were private but they could communicate throghout the game. The game lasted 16 rounds unless they collapsed their resource, of which 6 rounds where for learning the game (‘base line within treatments’) and another 10 rounds they had to face one of four possible situations or treatments. Base line groups continued playing the same game as the first 6 rounds. Threshold treatment players were informed that a climate event has occurred decreasing the reproduction rate of the resource. Risk treatment players faced a similar situation but the event was not certain, they knew that the probability of the event happening was of 50% chance. Uncertainty treatment also faced the possibility of a climate event on the second part of the game but players didn’t know the actual probability of the event. The figure below shows a summary of all time lines of the game per place and per treatment. The blue line shows the mean of the time series while the shaded area shows the smooth, this is the Gaussian confidence intervals based on the t-distribution (so we assume normality). The red line marks round 6, this is the end of the first part of the game. It helps to caution the reader that in round 7 the treatment started and we reset the fish stock size to 50 as in the beginning of the game, that’s why in all timelines there is a ‘jump’ from round 6 to 7.

dat <- read.csv(file="~/Dropbox/BEST/Colombia/0_Game data/160427_corrected_full_data_long.csv", row.names=1)

g <- ggplot(dat=dat, aes(x=Round, y=value)) + 
        geom_vline( aes( xintercept=6, color='red', alpha=0.1), show.legend = F) + 
        stat_summary(fun.data='mean_cl_normal', geom='smooth') + # option 'mean_cl_boot I like the most but normal assumes normality
        facet_grid(Treatment ~ Place) 
g

Now imagine each individual is exploring the parameter space of the stock size at the beginning of each round (which depends on what the group did on the previous round), and her/his own decision. Each individual will then follow a trajectory on that phase diagram that potentially capture a signature of her/his strategy. The figure below shows such trajectory, where each line represents the time series of each person for the afternoon session in Las Flores on February 4th. We played simultaneously with 16 players, 4 in each of the 4 above mentioned treatments. Here we see that the base line group collapsed twice, once on round 5 and another in round 11; this is why we see the line broken and never reaching the light blue colour that denotes the end of the game. Here it also comes clear that, for example, people who played the threshold treatment collectively chose to stay on high levels of the resource by fishing maximum 4 units and maintaining their resource above 38 fishes.

datP1 <- dat %>%
  filter(dat$Place == 'Las Flores' & dat$Session == 'pm' & Date == "2016-02-04" ) %>%
  gdata::drop.levels()

p <- ggplot(data=datP1, aes(x=value, y=StockSizeBegining), group=Player) + 
        geom_path(aes(color=Round)) + facet_grid(Treatment~Player) +
        ggtitle('Phase space trajectory per player\n Las Flores, 2016-02-04-pm ') + 
        theme(text= element_text(family='Helvetica', size=9))

p

At the group level, the same rationale can be applied if one plots the stock size at the begining of each round against the sum of the total catch for all 4 participants per group. Note that we have in total 64 groups (4 treatments * 4 sessions * 4 locations). The figure below summarizes all 64 groups, each one represented with a colour, and each colour showing the contour or the parameter space where most points were concentrated by location and treatment. The contour was calculated with a 2-dimensional smoother (bandwidth = 15). The figure on the left shows the aggregated group behaviour by ploting the sum of the total catch. The figure on the right shows the resource dynamics at the beginning of each round vs. the resource at the end of the round which is the same as the new stock size for the next round, this is what people fish plus the reproduction in the round. In dynamical systems this is equivalent to \(f(x[t+1]):x[t]\).

c <- ggplot(data=dat, aes(y=StockSizeBegining, x=SumTotalCatch), group=Session) + 
  stat_density_2d(aes(color=group, alpha=0.5), n=100, h=15, show.legend = F) + 
  facet_grid( Treatment ~ Place)+ 
    theme(text= element_text(family='Helvetica', size=9))
p <- ggplot(data=dat, aes(y=NewStockSize, x=StockSizeBegining), group=Session) + 
  geom_abline(aes(color='black', alpha=0.5),slope=1, intercept = 0) +
  stat_density_2d(aes(color=group, alpha=0.3), n=100, h=15, show.legend = F) + 
  geom_point(aes(color=group, alpha=0.3), show.legend=F) + 
  facet_grid( Treatment ~ Place)+ 
    theme(text= element_text(family='Helvetica', size=9))

grid.arrange ( c,p, ncol=2, nrow=1 )

The next natural question to ask is whether there is an effect between:

  • The first and the second part of the game
  • Treatments
  • Place

The four panels below try to get to these differences

pm1 <- ggpairs(data= filter(dat, dat$part == F), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Place, alpha=0.5), title='Color by place, first part') 

pm2 <- ggpairs(data= filter(dat, dat$part == T), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Place, alpha=0.5), title='Color by place, second part') 

pm3 <- ggpairs(data= filter(dat, dat$part == F), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Treatment, alpha=0.5), title='Color by treatment, second part') 

pm4 <- ggpairs(data= filter(dat, dat$part == T), 
              columns=c('StockSizeBegining', 'NewStockSize', 'Treatment','Place'), 
              upper= list(continuous='density'), lower=list(continuous='points'), 
              mapping=aes(color= Treatment, alpha=0.5), title='Color by treatment, second part') 

#grid.arrange(
  pm1   + theme(text= element_text(family='Helvetica', size=6))
  pm2   + theme(text= element_text(family='Helvetica', size=6))
  pm3   + theme(text= element_text(family='Helvetica', size=6))
  pm4   + theme(text= element_text(family='Helvetica', size=6))
#  ncol=2, nrow=2
#)

Something I’d really like to get into (for modeling purposes) is identifying fishing strategies or styles. To do so I ran a PCA and MDS on the time series (16 rounds) of each of the 256 players in our dataset. The results are shown below.

library(vegan)
players <- (reshape::cast(dat, ID_player ~ Round))[,-1] #delete playersID
place <- reshape::cast(dat, ID_player ~  Place)[,-1] # 16 because calculate length of 'value'
group <- reshape::cast(dat, ID_player ~  group)[,-1]
treat <- reshape::cast(dat, ID_player ~  Treatment)[,-1]
context <- cbind(place,treat) # don't use group yet, maybe for aes

pca <- rda(players, context) #, context

# note: I ran mds with 'euclidean' distance and 'manhattan'. Euclidean gave an error saying there is not enough data. manhattan calculated but didn't reach convergence. Morisita, jaccard, kulczynski, horn, rau, don't reach convergence neither. Frequent error on different measures: Stress is (nearly) zero - you may have insufficient data. It does work however if I use the transpose. It calculates the ordering of rounds on mds$points, but the ordering of players can be found too at mds$species
mds <- metaMDS(players, dist= 'manhattan', trymax=400, autotransform=F, k=2)
## Run 0 stress 0.1683166 
## Run 1 stress 0.171572 
## Run 2 stress 0.1714471 
## Run 3 stress 0.1735953 
## Run 4 stress 0.1712011 
## Run 5 stress 0.1717959 
## Run 6 stress 0.1707512 
## Run 7 stress 0.1701955 
## Run 8 stress 0.1732548 
## Run 9 stress 0.1717578 
## Run 10 stress 0.1757784 
## Run 11 stress 0.1707826 
## Run 12 stress 0.1711042 
## Run 13 stress 0.1748702 
## Run 14 stress 0.1748003 
## Run 15 stress 0.1727643 
## Run 16 stress 0.1719371 
## Run 17 stress 0.1717556 
## Run 18 stress 0.1693138 
## Run 19 stress 0.1696573 
## Run 20 stress 0.1687521 
## ... procrustes: rmse 0.009410413  max resid 0.1008896 
## Run 21 stress 0.1728164 
## Run 22 stress 0.1735087 
## Run 23 stress 0.1691987 
## Run 24 stress 0.1705527 
## Run 25 stress 0.1718176 
## Run 26 stress 0.1709458 
## Run 27 stress 0.1718148 
## Run 28 stress 0.175412 
## Run 29 stress 0.1716077 
## Run 30 stress 0.1701926 
## Run 31 stress 0.1709276 
## Run 32 stress 0.1762781 
## Run 33 stress 0.1698691 
## Run 34 stress 0.1751624 
## Run 35 stress 0.1715244 
## Run 36 stress 0.170606 
## Run 37 stress 0.1701235 
## Run 38 stress 0.1683964 
## ... procrustes: rmse 0.003925691  max resid 0.05692734 
## Run 39 stress 0.1715622 
## Run 40 stress 0.1686561 
## ... procrustes: rmse 0.009411931  max resid 0.1008758 
## Run 41 stress 0.1700099 
## Run 42 stress 0.1719182 
## Run 43 stress 0.1711574 
## Run 44 stress 0.1765374 
## Run 45 stress 0.169963 
## Run 46 stress 0.1704755 
## Run 47 stress 0.1709801 
## Run 48 stress 0.1739684 
## Run 49 stress 0.1737266 
## Run 50 stress 0.1731897 
## Run 51 stress 0.1698436 
## Run 52 stress 0.1713516 
## Run 53 stress 0.1760974 
## Run 54 stress 0.1705239 
## Run 55 stress 0.1734623 
## Run 56 stress 0.175865 
## Run 57 stress 0.1713336 
## Run 58 stress 0.1688796 
## Run 59 stress 0.1706529 
## Run 60 stress 0.1779502 
## Run 61 stress 0.1709459 
## Run 62 stress 0.1700121 
## Run 63 stress 0.1750217 
## Run 64 stress 0.1726188 
## Run 65 stress 0.1717318 
## Run 66 stress 0.1718669 
## Run 67 stress 0.1687753 
## ... procrustes: rmse 0.01004092  max resid 0.1020805 
## Run 68 stress 0.1725441 
## Run 69 stress 0.1710226 
## Run 70 stress 0.1749877 
## Run 71 stress 0.1714414 
## Run 72 stress 0.1716437 
## Run 73 stress 0.1752802 
## Run 74 stress 0.1754575 
## Run 75 stress 0.1785629 
## Run 76 stress 0.1713316 
## Run 77 stress 0.1706076 
## Run 78 stress 0.1709652 
## Run 79 stress 0.1732354 
## Run 80 stress 0.1724281 
## Run 81 stress 0.1733001 
## Run 82 stress 0.1717478 
## Run 83 stress 0.1762748 
## Run 84 stress 0.1735049 
## Run 85 stress 0.1755581 
## Run 86 stress 0.1712302 
## Run 87 stress 0.1720056 
## Run 88 stress 0.1721423 
## Run 89 stress 0.1697088 
## Run 90 stress 0.1778178 
## Run 91 stress 0.1758342 
## Run 92 stress 0.1704627 
## Run 93 stress 0.16878 
## ... procrustes: rmse 0.006714163  max resid 0.07920095 
## Run 94 stress 0.1700376 
## Run 95 stress 0.1703483 
## Run 96 stress 0.1762553 
## Run 97 stress 0.1703782 
## Run 98 stress 0.1714041 
## Run 99 stress 0.1734123 
## Run 100 stress 0.1704568 
## Run 101 stress 0.1693654 
## Run 102 stress 0.1723379 
## Run 103 stress 0.1710759 
## Run 104 stress 0.1694431 
## Run 105 stress 0.1713449 
## Run 106 stress 0.1711941 
## Run 107 stress 0.1738135 
## Run 108 stress 0.175471 
## Run 109 stress 0.1699958 
## Run 110 stress 0.172754 
## Run 111 stress 0.1746961 
## Run 112 stress 0.1717341 
## Run 113 stress 0.1698176 
## Run 114 stress 0.17569 
## Run 115 stress 0.1732747 
## Run 116 stress 0.1705851 
## Run 117 stress 0.1695978 
## Run 118 stress 0.1776777 
## Run 119 stress 0.1758886 
## Run 120 stress 0.1716789 
## Run 121 stress 0.1699488 
## Run 122 stress 0.175521 
## Run 123 stress 0.1684371 
## ... procrustes: rmse 0.005975167  max resid 0.06980982 
## Run 124 stress 0.1750233 
## Run 125 stress 0.1717374 
## Run 126 stress 0.1742467 
## Run 127 stress 0.1711864 
## Run 128 stress 0.1738349 
## Run 129 stress 0.1709993 
## Run 130 stress 0.1726651 
## Run 131 stress 0.170663 
## Run 132 stress 0.1727261 
## Run 133 stress 0.1702989 
## Run 134 stress 0.1718024 
## Run 135 stress 0.1694073 
## Run 136 stress 0.1740903 
## Run 137 stress 0.1707865 
## Run 138 stress 0.1696736 
## Run 139 stress 0.1694712 
## Run 140 stress 0.1710528 
## Run 141 stress 0.1707964 
## Run 142 stress 0.1730097 
## Run 143 stress 0.1700545 
## Run 144 stress 0.1691024 
## Run 145 stress 0.1715269 
## Run 146 stress 0.1713792 
## Run 147 stress 0.170747 
## Run 148 stress 0.1696465 
## Run 149 stress 0.1699109 
## Run 150 stress 0.1755789 
## Run 151 stress 0.173596 
## Run 152 stress 0.1696874 
## Run 153 stress 0.1700338 
## Run 154 stress 0.1860472 
## Run 155 stress 0.1706746 
## Run 156 stress 0.1739421 
## Run 157 stress 0.1705591 
## Run 158 stress 0.1687303 
## ... procrustes: rmse 0.008008199  max resid 0.08758061 
## Run 159 stress 0.171444 
## Run 160 stress 0.1782063 
## Run 161 stress 0.1709503 
## Run 162 stress 0.1738403 
## Run 163 stress 0.1759767 
## Run 164 stress 0.1705602 
## Run 165 stress 0.1707804 
## Run 166 stress 0.16978 
## Run 167 stress 0.1697941 
## Run 168 stress 0.1724272 
## Run 169 stress 0.1723704 
## Run 170 stress 0.1685999 
## ... procrustes: rmse 0.006040378  max resid 0.08770845 
## Run 171 stress 0.1722351 
## Run 172 stress 0.1736071 
## Run 173 stress 0.1708744 
## Run 174 stress 0.1683412 
## ... procrustes: rmse 0.002224394  max resid 0.02930657 
## Run 175 stress 0.1746781 
## Run 176 stress 0.1733516 
## Run 177 stress 0.1801277 
## Run 178 stress 0.1745587 
## Run 179 stress 0.1727922 
## Run 180 stress 0.170691 
## Run 181 stress 0.1702412 
## Run 182 stress 0.1748476 
## Run 183 stress 0.1727885 
## Run 184 stress 0.1730955 
## Run 185 stress 0.1697159 
## Run 186 stress 0.169577 
## Run 187 stress 0.173188 
## Run 188 stress 0.1710129 
## Run 189 stress 0.1734143 
## Run 190 stress 0.1719864 
## Run 191 stress 0.1703731 
## Run 192 stress 0.1695943 
## Run 193 stress 0.1747641 
## Run 194 stress 0.17353 
## Run 195 stress 0.1704468 
## Run 196 stress 0.1701712 
## Run 197 stress 0.1713511 
## Run 198 stress 0.1716776 
## Run 199 stress 0.1707071 
## Run 200 stress 0.1725994 
## Run 201 stress 0.17117 
## Run 202 stress 0.1738242 
## Run 203 stress 0.1705217 
## Run 204 stress 0.1722856 
## Run 205 stress 0.1731751 
## Run 206 stress 0.1750901 
## Run 207 stress 0.1706368 
## Run 208 stress 0.1750053 
## Run 209 stress 0.171002 
## Run 210 stress 0.1725127 
## Run 211 stress 0.1727074 
## Run 212 stress 0.1712324 
## Run 213 stress 0.176584 
## Run 214 stress 0.1718324 
## Run 215 stress 0.1784117 
## Run 216 stress 0.1699832 
## Run 217 stress 0.1711527 
## Run 218 stress 0.1719506 
## Run 219 stress 0.1730311 
## Run 220 stress 0.1713321 
## Run 221 stress 0.1721117 
## Run 222 stress 0.1718074 
## Run 223 stress 0.1697433 
## Run 224 stress 0.1745713 
## Run 225 stress 0.168439 
## ... procrustes: rmse 0.005783452  max resid 0.07041921 
## Run 226 stress 0.1745692 
## Run 227 stress 0.169343 
## Run 228 stress 0.1729225 
## Run 229 stress 0.1725065 
## Run 230 stress 0.1708802 
## Run 231 stress 0.1722497 
## Run 232 stress 0.1718353 
## Run 233 stress 0.1739463 
## Run 234 stress 0.1751717 
## Run 235 stress 0.1727417 
## Run 236 stress 0.1729624 
## Run 237 stress 0.1734059 
## Run 238 stress 0.1684592 
## ... procrustes: rmse 0.004848654  max resid 0.05515583 
## Run 239 stress 0.17053 
## Run 240 stress 0.1696726 
## Run 241 stress 0.16968 
## Run 242 stress 0.1738909 
## Run 243 stress 0.1705498 
## Run 244 stress 0.1728521 
## Run 245 stress 0.1729706 
## Run 246 stress 0.1714049 
## Run 247 stress 0.1720863 
## Run 248 stress 0.1721538 
## Run 249 stress 0.1687643 
## ... procrustes: rmse 0.009009771  max resid 0.1019116 
## Run 250 stress 0.1707099 
## Run 251 stress 0.1727474 
## Run 252 stress 0.1737973 
## Run 253 stress 0.1707094 
## Run 254 stress 0.171472 
## Run 255 stress 0.1704324 
## Run 256 stress 0.1709252 
## Run 257 stress 0.1706167 
## Run 258 stress 0.1736351 
## Run 259 stress 0.1715554 
## Run 260 stress 0.1700837 
## Run 261 stress 0.1704744 
## Run 262 stress 0.1738413 
## Run 263 stress 0.1705251 
## Run 264 stress 0.1702509 
## Run 265 stress 0.1719673 
## Run 266 stress 0.1700102 
## Run 267 stress 0.1727691 
## Run 268 stress 0.1732018 
## Run 269 stress 0.1722314 
## Run 270 stress 0.1773607 
## Run 271 stress 0.1709964 
## Run 272 stress 0.1713967 
## Run 273 stress 0.1684144 
## ... procrustes: rmse 0.004181967  max resid 0.05420549 
## Run 274 stress 0.1699394 
## Run 275 stress 0.1719447 
## Run 276 stress 0.1702144 
## Run 277 stress 0.1711456 
## Run 278 stress 0.1734492 
## Run 279 stress 0.1706147 
## Run 280 stress 0.1781211 
## Run 281 stress 0.1710896 
## Run 282 stress 0.1725406 
## Run 283 stress 0.1715357 
## Run 284 stress 0.1709675 
## Run 285 stress 0.1727341 
## Run 286 stress 0.1723682 
## Run 287 stress 0.1733533 
## Run 288 stress 0.1725974 
## Run 289 stress 0.1726523 
## Run 290 stress 0.1697946 
## Run 291 stress 0.1717886 
## Run 292 stress 0.1712665 
## Run 293 stress 0.1702697 
## Run 294 stress 0.1763435 
## Run 295 stress 0.1696651 
## Run 296 stress 0.1747775 
## Run 297 stress 0.1726211 
## Run 298 stress 0.16976 
## Run 299 stress 0.1758223 
## Run 300 stress 0.1717487 
## Run 301 stress 0.173585 
## Run 302 stress 0.1716382 
## Run 303 stress 0.1720001 
## Run 304 stress 0.1696953 
## Run 305 stress 0.1847525 
## Run 306 stress 0.1699995 
## Run 307 stress 0.169961 
## Run 308 stress 0.171002 
## Run 309 stress 0.1707732 
## Run 310 stress 0.1752688 
## Run 311 stress 0.1714982 
## Run 312 stress 0.1757249 
## Run 313 stress 0.1712139 
## Run 314 stress 0.1709326 
## Run 315 stress 0.1728986 
## Run 316 stress 0.1709186 
## Run 317 stress 0.1714718 
## Run 318 stress 0.1706675 
## Run 319 stress 0.1726971 
## Run 320 stress 0.1712829 
## Run 321 stress 0.1753119 
## Run 322 stress 0.1684296 
## ... procrustes: rmse 0.007486056  max resid 0.1010493 
## Run 323 stress 0.1717712 
## Run 324 stress 0.1730014 
## Run 325 stress 0.1690431 
## Run 326 stress 0.1774275 
## Run 327 stress 0.172606 
## Run 328 stress 0.1713407 
## Run 329 stress 0.1724874 
## Run 330 stress 0.1728318 
## Run 331 stress 0.1716664 
## Run 332 stress 0.1761918 
## Run 333 stress 0.1733311 
## Run 334 stress 0.1744288 
## Run 335 stress 0.1713487 
## Run 336 stress 0.1740721 
## Run 337 stress 0.1737278 
## Run 338 stress 0.1718268 
## Run 339 stress 0.1723074 
## Run 340 stress 0.1704185 
## Run 341 stress 0.1711384 
## Run 342 stress 0.1714642 
## Run 343 stress 0.1745121 
## Run 344 stress 0.1744748 
## Run 345 stress 0.1707797 
## Run 346 stress 0.1724158 
## Run 347 stress 0.1748373 
## Run 348 stress 0.1709706 
## Run 349 stress 0.1744717 
## Run 350 stress 0.1714946 
## Run 351 stress 0.1684443 
## ... procrustes: rmse 0.00543619  max resid 0.0602239 
## Run 352 stress 0.1718903 
## Run 353 stress 0.1722507 
## Run 354 stress 0.1696675 
## Run 355 stress 0.170907 
## Run 356 stress 0.1777037 
## Run 357 stress 0.1727555 
## Run 358 stress 0.1719303 
## Run 359 stress 0.1713681 
## Run 360 stress 0.1727342 
## Run 361 stress 0.1769285 
## Run 362 stress 0.1720814 
## Run 363 stress 0.1712327 
## Run 364 stress 0.1704312 
## Run 365 stress 0.170544 
## Run 366 stress 0.1699483 
## Run 367 stress 0.1697799 
## Run 368 stress 0.1714528 
## Run 369 stress 0.1701648 
## Run 370 stress 0.1724719 
## Run 371 stress 0.1775749 
## Run 372 stress 0.1724426 
## Run 373 stress 0.1683825 
## ... procrustes: rmse 0.003475691  max resid 0.04810693 
## Run 374 stress 0.1712596 
## Run 375 stress 0.1713231 
## Run 376 stress 0.1743225 
## Run 377 stress 0.1720856 
## Run 378 stress 0.1735691 
## Run 379 stress 0.1699372 
## Run 380 stress 0.1695891 
## Run 381 stress 0.1723273 
## Run 382 stress 0.1704295 
## Run 383 stress 0.1683452 
## ... procrustes: rmse 0.006513576  max resid 0.1009568 
## Run 384 stress 0.1709782 
## Run 385 stress 0.1702082 
## Run 386 stress 0.173587 
## Run 387 stress 0.1699878 
## Run 388 stress 0.1707655 
## Run 389 stress 0.171516 
## Run 390 stress 0.1711221 
## Run 391 stress 0.1709217 
## Run 392 stress 0.1701296 
## Run 393 stress 0.1692925 
## Run 394 stress 0.1715386 
## Run 395 stress 0.1719761 
## Run 396 stress 0.1727074 
## Run 397 stress 0.1718885 
## Run 398 stress 0.1715797 
## Run 399 stress 0.1715553 
## Run 400 stress 0.1738439
# mds2 <- metaMDS(players, dist= 'mahalanobis', trymax=1000, autotransform=F, k=3, previous.best = mds) # solution from http://stackoverflow.com/questions/14434794/no-stable-solution-using-metamds-in-vegan
ef1 <- envfit(mds, context, permu=999)

par(mfrow=c(1,3))
  plot(pca)
  plot(mds, type='p', display=c('sites', 'species'), cex=0.8)
  plot(ef1, p.max=0.05, col='blue', cex=0.8)
  stressplot(mds)
    

To-Do

  1. ggpairs plots for treatments, place, and before/after
  2. statistics: t-test, Mann-Witness test, Fisher exact test for all
  3. some simple regressions
  4. styles clustering
g <- ggplot(dat, aes(x=value, y=StockSizeBegining,color=Treatment )) +
        geom_point() +
        geom_smooth( method='loess') +
        ggtitle('Local polynomial regression\n Stock size vs Decision')
g